Do open data impact citizens’ behavior? Assessing face mask panic buying behaviors during the Covid-19 pandemic

Data are essential for digital solutions and for supporting citizens’ everyday behavior. Open data initiatives have expanded worldwide in recent decades, yet investigation of the actual usage of open data and evaluation of their impacts remain insufficient. Thus, in this paper, we examine an exemplary use case of open data during the early stage of the Covid-19 pandemic and assess its impacts on citizens. Based on quasi-experimental methods, the study found that publishing local stores’ real-time face mask stock levels as open data may have influenced people’s purchase behaviors. The results indicate reduced panic buying as a consequence of the openly accessible information in the form of an online mask map. Furthermore, the results suggest that such open-data-based countermeasures did not impact every citizen equally; rather, their effects varied with socioeconomic conditions, in particular education level.

where ShortPost_jt is the interaction term between whether an area is treated (mask map use at the 1 % or 3 % threshold) and the short-term post-policy change period (April 9 and April 10). The medium-term counterpart is the interaction term between whether an area is treated (mask map use at the 1 % or 3 % threshold) and the medium-term post-policy change period (April 11 and April 15). Other variables are defined in the same way as in Eq. (2). For the statistical test, standard errors are clustered at the area level. Figure S4 reports our findings (Eq. 5), comparing short-term and medium-term impacts of the mask map. The reduction in the overall number of masks sold was greater among stores in mask map use areas within two days of the policy change. However, estimations restricted to areas with higher college graduate rates found smaller impacts of mask map use on reducing mask purchasing, and we found almost no impact when restricting the sample to stores in areas with lower college graduate rates. Although areas with higher mask map use experienced less panic buying overall, these findings also suggest that this pattern might have been driven by other socioeconomic conditions, such as education and economic differences, as discussed in the previous subsection.
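As an illustration of how short-term and medium-term post-period interaction terms of this kind could be constructed, here is a minimal Python sketch. It is not the authors' code. The names `post_interactions`, `SHORT_TERM`, and `MEDIUM_TERM` are hypothetical, the year 2020 is taken from the policy date given elsewhere in the text, and treating each period as an inclusive date range (April 9 to 10, April 11 to 15) is an assumption.

```python
from datetime import date

# Assumed inclusive windows (hypothetical constants, not from the paper's code).
SHORT_TERM = (date(2020, 4, 9), date(2020, 4, 10))
MEDIUM_TERM = (date(2020, 4, 11), date(2020, 4, 15))


def in_window(day, window):
    """Return True if `day` falls inside the inclusive (start, end) window."""
    start, end = window
    return start <= day <= end


def post_interactions(treated, day):
    """Return (short_post, medium_post) dummies for one area-day observation.

    Each dummy is 1 only when the area is treated AND the day falls in the
    corresponding post-policy window; otherwise it is 0.
    """
    t = int(bool(treated))
    return (t * int(in_window(day, SHORT_TERM)),
            t * int(in_window(day, MEDIUM_TERM)))
```

Under these assumptions, a treated area observed on April 10 gets `(1, 0)`, the same area on April 12 gets `(0, 1)`, and any untreated area gets `(0, 0)` on every date.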

D. Alternative identification designs
In the main text, we compare stores in mask map use areas (treated areas) with those in no/lower map use areas (not-treated areas). As discussed in the main text, higher college graduation rates are associated with lower numbers of masks sold (reduced panic-buying behavior) after the mask purchase restrictions were loosened. Here, as a complementary check of this trend, we add dummy variables for higher college graduate rate areas to the DiD model (Eq. 6), where HighEdu_j is a dummy variable for higher college graduate areas. We test two definitions of this dummy: first, we consider areas whose college graduation rates are above the average (33 %) as higher college graduate areas; second, we consider areas whose college graduation rates are above 40 % as higher educated areas. For Use_jt, we also test two thresholds (1 % and 3 %), as in the main analysis (Eq. 2).
In addition, we conducted an event study with the higher educated areas as the treatment to test the parallel trend assumption. Following the event study design of Eq. 3, the specification is given in Eq. (7), where HighEdu_jt,k is a set of dummy variables indicating the treatment status at different periods. We investigate 13 days before and 7 days after the mask purchase policy was loosened (April 9th, 2020). The dummy for k = 1 is omitted in Eq. (7) so that the higher college graduation impacts are relative to the period one day before the policy change. The parameter of interest, t_k, estimates the effects of the higher college graduation rates k days before/after the policy change, testing whether the alternative treatment (higher college graduation rate) affects the number of masks sold before the policy was loosened.
Intuitively, the coefficient t_k measures the difference in the number of masks sold between areas with higher college graduation rates and other areas in period k, relative to the difference one day before the policy change.
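To make the event-study setup more concrete, the following Python sketch builds a set of lead and lag dummies around the policy date, leaving out the one-day-before period as the reference. It is an illustrative sketch rather than the authors' implementation. The function names are hypothetical, the Lk/Dk labels follow the table and figure captions, and labelling the policy day itself as `D0` is an assumption.

```python
from datetime import date

POLICY_DATE = date(2020, 4, 9)  # date the mask purchase policy was loosened


def event_label(day, policy_date=POLICY_DATE):
    """Label a day as Lk (k days before) or Dk (k days after the policy date)."""
    offset = (day - policy_date).days
    return f"L{-offset}" if offset < 0 else f"D{offset}"


def event_dummies(high_edu, day, leads=13, lags=7, omitted="L1"):
    """Return {label: 0/1} event-time dummies for one area-day observation.

    A dummy is 1 only if the area is in the higher-education group and the
    day matches that event-time label. The omitted label is the reference
    period, so the remaining coefficients are read relative to it.
    """
    labels = [f"L{k}" for k in range(leads, 0, -1)]
    labels += [f"D{k}" for k in range(0, lags + 1)]  # assumption: D0 = policy day
    current = event_label(day)
    return {lab: int(bool(high_edu) and lab == current)
            for lab in labels if lab != omitted}
```

For a higher-education area on April 7, only the `L2` entry is 1. On April 8 (the reference day) all entries are 0, which is what makes each remaining coefficient interpretable as a difference relative to one day before the policy change.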
As a result, with the alternative model (Eq. 6), we still found that the interaction terms between mask map use and the higher college graduation rate have significant negative effects on the number of masks sold (Figure S5 and Table S5). This alternative approach also supports the conclusion that mask map usage reduced panic buying behaviors. The parallel trend assumption test results are shown in Figure S6 (see Table S6 for the full results).
Histograms in blue describe those of not-treated stores (stores outside of the mask map use areas). All histograms are created after converting variables to z-scores (mean = 0, standard deviation = 1).
Figure S3. Histograms of the outcome and covariates by treated and non-treated groups (in the case of the 1 % threshold for the definition of mask map use) and areas' college graduation rates. Each panel depicts one variable's histograms. Four types of histograms are included in each panel. Red histograms represent those of treated stores (stores in the mask map use areas) in higher college graduate rate areas. Blue histograms represent those of treated stores in lower college graduate rate areas. Green histograms represent those of non-treated stores in higher college graduate rate areas. Orange histograms represent those of non-treated stores in lower college graduate rate areas. All histograms are created after converting variables to z-scores (mean = 0, standard deviation = 1).
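The z-score conversion mentioned in the histogram captions can be sketched in a few lines of Python. This is a minimal illustration only: the function name `z_scores` is hypothetical, and using the population standard deviation (rather than the sample one) is an assumption, since the text does not say which was used.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = fmean(values)
    sigma = pstdev(values, mu)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("z-scores are undefined for a constant variable")
    return [(v - mu) / sigma for v in values]
```

Standardizing each variable this way puts the outcome and covariates on a common scale, so histograms for different groups can be compared within one panel.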

Table S2. Each area's number of stores, treated status, and higher college graduation rate status. In Column 2, X indicates that the corresponding area is regarded as treated (a mask map use area with the 1 % threshold). In Column 3, X indicates that the corresponding area is regarded as treated (a mask map use area with the 3 % threshold). In Column 4, X indicates that the corresponding area is regarded as a higher college graduation rate area (above the third quartile of all areas' college graduation rates). In this study, we excluded areas whose mask map usage data are unavailable. For locations of the stores, see Figure 1.
Table S3. Estimated coefficients of the DiD approach with interaction terms between socioeconomic covariates and the treatment dummy (Eq. 4). Each column is a separate DiD model result. Columns 1-2 show results with all stores in our sample. Columns 3-4 show results with stores in higher college graduation rate areas. Columns 5-6 show results with stores in lower college graduation rate areas. Standard errors, clustered at the area level, are in parentheses under the corresponding coefficients. Treatment dummy thresholds are set to 1 % and 3 % as robustness tests. In addition, we apply each model to subgroups of stores (stores in higher college graduation rate areas and stores in lower college graduation rate areas). Empty cells indicate that the corresponding variables were dropped due to high correlations with other variables. For visualization of coefficients, see Figure 3. ***: p < 0.01, **: p < 0.05, *: p < 0.1.
Figure S5. Estimated coefficients of the alternative model estimations (Eq. 6). All variables except dummy variables are standardized to have a mean of 0 and a standard deviation of 1. Date-specific effects are included in all the estimations, and standard errors are clustered at the area level. In Panel a, the treatment dummy, higher college graduates, is 1 if an area's college graduation rate is above the average (33 %). In Panel b, the treatment dummy is 1 if an area's college graduation rate is above 40 %. In both panels, red lines are the estimated coefficients when areas whose mask map use rates are more than 1 % are considered mask map use areas. Blue lines are the estimated coefficients when areas whose mask map use rates are more than 3 % are considered mask map use areas. See Table S5 for the full results.
Figure S6. Event study model of the alternative identification approach (Eq. 7). Each panel represents a separate regression using the event study approach. The outcome variable in all panels is the number of masks sold per household. The estimated coefficients and their 95 % confidence intervals are plotted. Vertical orange dashed lines indicate when the government loosened the mask policy. Lk and Dk on the x-axis represent k days before and after the policy change, respectively. The dummy variable for one day before the policy implementation (k = 1) is omitted from the regressions. Panel a shows the estimated coefficients of Eq. (7) with a treatment threshold of 33 %. Panel b shows the estimated coefficients of Eq. (7) with a treatment threshold of 40 %. In the equations, we use various covariates (e.g., maximum mask stock per day per store, new Covid-19 cases per area per day, and college graduate percentage per area). Date fixed effects are included and standard errors are clustered at the area level. See Table S6 for the full results.
In Columns 1-3, the results when the threshold to determine mask map use areas is 1 % are shown. In Columns 4-6, the results when the threshold to determine mask map use areas is 3 % are shown. In Columns 1 and 4, all stores in our sample are used. In Columns 2 and 5, the stores in higher college graduation rate areas are used. In Columns 3 and 6, the stores in lower college graduation rate areas are used. Lk is the coefficient of the dummy variable indicating k days before the policy was loosened (April 9, 2020). Dk is the coefficient of the dummy variable indicating k days after the policy change. All models present standard errors clustered at the area level. Empty cells indicate that the corresponding covariates were dropped due to high correlations with other variables. Standard errors are in parentheses under the corresponding estimated coefficients. For the visualization of the results, see Figure 2. ***: p < 0.01, **: p < 0.05, *: p < 0.1.
Table S5. Estimated coefficients of the alternative model (Eq. 6). Each column shows a separate DiD estimation result. Columns 1-2 show the results of Eq. (6) when higher college graduation rate areas are defined as those above the average (33 %). Columns 3-4 show the results of Eq. (6) when higher college graduation rate areas are defined as those above 40 %. For visualization of the estimated coefficients, see Figure S5. Standard errors are clustered at the area level and shown in parentheses under the corresponding coefficients. Empty cells indicate that the corresponding covariates were dropped due to high correlations with other variables. ***: p < 0.01, **: p < 0.05, *: p < 0.1.

Table S6. Event study model of the alternative identification approach (Eq. 7). Lk is the coefficient of the dummy variable indicating k days before the policy was loosened (April 9, 2020). Dk is the coefficient of the dummy variable indicating k days after the policy was loosened. For a visualization of the estimated coefficients of the dummy variables, see Figure S6. Standard errors are clustered at the area level and shown in parentheses under the corresponding coefficients. ***: p < 0.01, **: